Properly Acting under Partial Observability with Action Feasibility Constraints

Authors

Caroline Ponzoni Carvalho Chanel

Florent Teichteil-Königsbuch

Abstract

We introduce Action-Constrained Partially Observable Markov Decision Processes (AC-POMDPs), which arose from studying critical robotic applications with damaging actions. AC-POMDPs restrict the optimized policy to apply only feasible actions: each action is feasible in a subset of the state space, and the agent can observe the set of actions applicable in the current hidden state, in addition to standard observations. We present optimality equations for AC-POMDPs, which imply operating on α-vectors defined over many different belief subspaces. We propose an algorithm named PreCondition Value Iteration (PCVI), which fully exploits this specific property of α-vectors in AC-POMDPs. We also designed a relaxed version of PCVI whose complexity is exponentially smaller than that of PCVI. Experimental results on POMDP robotic benchmarks with action feasibility constraints exhibit the benefits of explicitly exploiting the semantic richness of action-feasibility observations in AC-POMDPs over equivalent but unstructured POMDPs.
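To make the feasibility observation concrete, here is a minimal Python sketch, not the authors' PCVI algorithm. It assumes a hypothetical three-state toy model (`FEASIBLE`, the state names, and both helper functions are invented for illustration) and shows only one idea from the abstract: observing the set of applicable actions rules out every hidden state whose feasible set differs, which confines the belief to a subspace.

```python
from typing import Dict, FrozenSet

State = str
Action = str
Belief = Dict[State, float]

# Hypothetical toy model (an assumption, not from the paper):
# the set of feasible actions in each hidden state.
FEASIBLE: Dict[State, FrozenSet[Action]] = {
    "open_floor": frozenset({"forward", "turn"}),
    "near_cliff": frozenset({"turn"}),
    "corridor": frozenset({"forward", "turn"}),
}


def condition_on_feasible_set(belief: Belief, observed: FrozenSet[Action]) -> Belief:
    """Keep only states whose feasible set equals the observed one, then renormalize."""
    kept = {s: p for s, p in belief.items() if FEASIBLE[s] == observed}
    total = sum(kept.values())
    if total == 0.0:
        raise ValueError("observed action set is inconsistent with the belief")
    return {s: p / total for s, p in kept.items()}


def applicable_actions(belief: Belief) -> FrozenSet[Action]:
    """Actions that are feasible in every state the belief still considers possible."""
    support = [s for s, p in belief.items() if p > 0.0]
    return frozenset.intersection(*(FEASIBLE[s] for s in support))


prior = {"open_floor": 0.5, "near_cliff": 0.25, "corridor": 0.25}
posterior = condition_on_feasible_set(prior, frozenset({"forward", "turn"}))
```

In this toy setting, only `turn` is safe under the prior because the agent might be near the cliff; after the feasibility observation, `near_cliff` is excluded and `forward` becomes applicable as well.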

Similar resources

Reasoning about Strategies under Partial Observability and Fairness Constraints

A number of extensions exist for Alternating-time Temporal Logic; some of these mix strategies and partial observability but, to the best of our knowledge, no work provides a unified framework for strategies, partial observability and fairness constraints. In this paper we propose ATLK^F_po, a logic mixing strategies under partial observability and epistemic properties of agents in a system with...

Computational Analysis of Control Systems Using Dynamic Optimization (arXiv:0906.0215v2 [math.OC], 17 Jul 2009)

Several concepts on the measure of observability, reachability, and robustness are defined and illustrated for both linear and nonlinear control systems. Defined by using computational dynamic optimization, these concepts are applicable to a wide spectrum of problems. Some questions addressed include the observability based on user information, the determination of strong observability vs. weak ...

Planning with Nondeterministic Actions and Sensing

Many planning problems involve nondeterministic actions: actions whose effects are not completely determined by the state of the world before the action is executed. In this paper we consider the computational complexity of planning in domains where such actions are available. We give a formal model of nondeterministic actions and sensing, together with an action language for specifying planning...

Policy Learning with Hypothesis based Local Action Selection

For robots to be effective in human environments, they should be capable of successful task execution in unstructured environments. Many task-oriented manipulation behaviors executed by robots rely on model-based grasping strategies, and model-based strategies require accurate object detection and pose estimation. Both of these tasks are hard in human environments, since human environments...

Publication date: 2013
